BACTERIA WITH REDUCED GENOME

CROSS-REFERENCE TO RELATED APPLICATIONS 
[0001] Not applicable. 

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] This invention was made with United States government support awarded by the
following agency:
NIH GM35682
The United States has certain rights in this invention.

BACKGROUND OF THE INVENTION 

[0003] Bacteria have been used to produce a wide range of commercial products. For
example, many Streptomyces strains and Bacillus strains have been used to produce antibiotics;
Pseudomonas denitrificans and many Propionibacterium strains have been used to produce
vitamin B12; some other bacteria have been used to produce riboflavin; Brevibacterium
flavum and Corynebacterium glutamicum have been used to produce lysine and glutamic acid,
respectively, as food additives; other bacteria have been used to produce other amino acids used as
food additives; Alcaligenes eutrophus has been used to produce biodegradable microbial plastics;
and many Acetobacter and Gluconobacter strains have been used to produce vinegar. More
recently, it has become common for bacteria, such as Escherichia coli (E. coli), to be genetically
engineered and used as host cells for the production of biological reagents, such as proteins and
nucleic acids, in laboratory as well as industrial settings. The pharmaceutical industry offers
several examples of successful products that are human proteins manufactured in E.
coli cultures cultivated in a fermenter.

[0004] It is not uncommon for normal bacterial proteins to adversely affect
the production or the purification of a desired protein product from an engineered bacterium. For
example, when E. coli bacteria are used as host cells to generate a large quantity of a desired
product encoded by a gene that is introduced into the host cells by a plasmid, certain normal E.
coli gene products can interfere with the introduction and maintenance of plasmid DNA. More
significantly, because of the economies of bacterial culture in making proteins in bacteria, often
the cost of purification of a recombinant protein can be more than the cost of production, and
some of the natural proteins produced by the bacterial host pose particular purification problems.
Many bacterial strains produce toxins that must be purified away from the target protein being
produced and some strains can produce, by coincidence, native proteins that are close in size to the
target protein, thereby making size separation unavailable for the purification process.

[0005] In addition, the genome of a bacterium used in a fermenter to produce a
recombinant protein includes many unnecessary genes. A bacterium living in a natural environment
has many condition-responsive genes that provide mechanisms for surviving difficult environmental
conditions of temperature, stress or lack of food source. Bacteria living in a fermentation tank do
not have these problems and hence do not require these condition-responsive genes. The bacterial
host spends metabolic energy each multiplication cycle replicating these genes. Thus the
unnecessary genes and the unneeded proteins produced by a bacterial host used for production of
recombinant protein simply represent inefficiencies in the system that could be improved
upon.

[0006] It is not terribly difficult to make deletions in the genome of a microorganism. One
can perform random deletion studies in organisms by simply deleting genomic regions to study
what traits of the organism are lost with the deleted genes. It is more difficult, however, to make
targeted deletions of specific regions of genomic DNA and more difficult still if one of the
objectives of the method is to leave no inserted DNA, here termed a "scar," behind in the
organism after the deletion. If regions of inserted DNA, i.e. scars, are left behind after a genomic
deletion procedure, those regions can be the locations for unwanted recombination events that
could excise from the genome regions that are desirable or engender genome rearrangements.
In building a series of multiple deletions, scars left behind in previous steps could become
artifactual targets for succeeding steps of deletion. This is especially so when the method is used
repeatedly to generate a series of deletions from the genome. In other words, the organism
becomes genetically unstable through the deletion process if inserted DNA is left behind.

BRIEF SUMMARY OF THE INVENTION

[0007] The present invention provides a bacterium having a genome that is genetically 
engineered to be at least two percent (2%) to fifteen percent (15%) smaller than the genome of its 
native parent strain. When used to produce a product, a bacterium with a smaller genome can 
have one or more of the following advantages. One, the production process can be more efficient 
either in terms of resource consumption or in terms of production speed, or both. Two, the 
product purification process can be simplified or purer products can be made. Three, a product
that could not be produced before due to native protein interference can be produced.

[0008] To make a bacterium with a smaller genome, genes and other DNA sequences that
are not required for cell survival and protein production in culture can be deleted.

[0009] The present invention also provides methods for targeted deletion of genes and other
DNA sequences from a bacterial genome without leaving any residual DNA from the
manipulation. Since the methods of the present invention seldom introduce mutations into the 
genomic DNA sequences around deletion sites, the methods can be used to generate a series of 
deletions in a bacterium without increasing the possibility of undesired homologous 
recombination within the genome. Some of these methods are also useful for similar deletions in 
higher organisms. 

[0010] The first method is linear DNA-based. To perform the process, first, a linear DNA
construct is provided in a bacterium and a region of the bacterial genome is replaced by the linear
DNA construct through homologous recombination aided by a system residing in the bacterium
that can increase the frequency of homologous recombination. Next, a separate gene previously
introduced into the bacterium expresses a sequence-specific nuclease to cut the bacterial genome 
at a unique recognition site located on the linear DNA construct. Then, a DNA sequence 
engineered into one end of the linear DNA construct undergoes homologous recombination with a 
similar genomic DNA sequence located close to the other end of the linear DNA construct. The 
net result is a precise deletion of a region of the genome. 

[0011] The second method is also linear DNA-based. Two DNA sequences, one of which
is identical to a sequence that flanks one end of a bacterial genome region to be deleted and the
other of which is identical to a sequence that flanks the other end of the bacterial genome region to
be deleted, are engineered into a vector in which the two sequences are located next to each other.
At least one sequence-specific nuclease recognition site is also engineered into the vector on one
side of the two sequences. The vector is introduced into a bacterium and a linear DNA is
generated inside the bacterium by expressing inside the bacterium a nuclease that recognizes the
sequence-specific nuclease recognition site and cuts the vector therein. The linear DNA
undergoes homologous recombination with the bacterial genome aided by a system residing in the
bacterium to increase the frequency of homologous recombination. A bacterium with a targeted
deletion in its genome is thus produced.

[0012] The second method described above can also be used to replace a selected region of 
a bacterial genome with a desired DNA sequence. In this case, a desired DNA sequence that can 
undergo homologous recombination with and hence replace the selected region is engineered into 
the vector. All other aspects are the same as for deleting a targeted region. 

[0013] The third method is suicide plasmid-based. The specific plasmid used in this
method contains an origin of replication controlled by a promoter and a selectable marker, such as
an antibiotic resistance gene. To delete a targeted region of a bacterial genome, a DNA insert that 
contains two DNA sequences located right next to each other, one of which is identical to a 
sequence that flanks one end of a bacterial genome region to be deleted and the other of which is 
identical to a sequence that flanks the other end of the bacterial genome region, is inserted into the 
plasmid. The plasmid is then introduced into the bacteria and integrated into the bacterial 
genome. Next, the promoter is activated to induce replication from the ectopic origin introduced 
into the bacterial genome so that recombination events are selected. In many bacteria, the 
recombination events will result in a precise deletion of the targeted region of the bacterial 
genome and these bacteria can be identified. An alternative way to select for recombination 
events is to engineer a recognition site of a sequence-specific nuclease into the specific plasmid 
and cut the bacterial genome with the sequence-specific nuclease after the plasmid has integrated 
into the bacterial genome. 

[0014] The suicide plasmid-based method described above can also be used to replace a
selected region of a bacterial genome with a desired DNA sequence. In this case, a DNA insert
that contains a desired DNA sequence that can undergo homologous recombination with and 
hence replace the selected region is inserted into the plasmid. All other aspects are the same as for 
deleting a targeted region. 



[0015] Other objects, features and advantages of the invention will become apparent upon
consideration of the following detailed description.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS 

[0016] Fig. 1 shows positions of the genes and other DNA sequences on the E. coli K-12
genome that were candidates for deletion as black and lighter hatched boxes on the
outermost ring.

[0017] Fig. 2 illustrates a specific example of a linear DNA-based scarless genetic
modification method of the present invention.

[0018] Fig. 3 illustrates a specific example of another linear DNA-based method of the
present invention.

[0019] Fig. 4 shows a mutagenesis plasmid that can be used in the linear DNA-based
method illustrated in Fig. 3.

[0020] Fig. 5A-C illustrates a specific example of a suicide plasmid-based method of the
present invention.

[0021] Fig. 6 shows three plasmids that can be used in the suicide plasmid-based method
illustrated in Fig. 5A-C.

DETAILED DESCRIPTION OF THE INVENTION 

[0022] Bacteria in their natural environment are exposed to many conditions that are not
normally experienced in standard industrial or laboratory growth, and thus carry a large number of
condition-dependent, stress-induced genes or otherwise nonessential genes which may not be
needed in industrial or laboratory use of the organisms. This invention began with the realization
that much of the genetic information contained within the genome of a bacterial strain could be
deleted without detrimental effect on the use of bacterial cultures in processes of industrial or
laboratory importance. It was recognized that a bacterium with a reduced genome might be
advantageous over native strains in many industrial and laboratory applications. For example, a
bacterium with a reduced genome is at least somewhat less metabolically demanding and thus can
produce a desired product more efficiently. In addition, a reduced genome can lead to fewer
native products and lower levels of certain native proteins, allowing easier purification of a desired
protein from the remaining bacterial proteins. Furthermore, some bacterial genetic sequences are
associated with instabilities that can interfere with standard industrial or laboratory practices, and
might entail costly and burdensome quality control procedures.

[0023] The present invention also involves several methods for deleting genomic DNA
from a genome without leaving any inserted DNA behind. If one is making several sequential
deletions from the single DNA molecule which makes up a bacterial genome, it is important not to 
leave any inserted DNA sequences behind. Such inserted sequences, if they were left behind, 
would be candidate sites for undesired recombination events that would delete uncharacterized 
and perhaps important portions of the remaining genome from the bacteria or cause other 
unanticipated genome rearrangements with untoward effects. Since one of the objectives of the 
genome reduction effort is to increase the genetic stability of the bacteria, leaving any inserted 
DNA behind would be contrary to the objective, and should be avoided. Thus the methods used to 
delete DNA from the genome become important and sophisticated. 

[0024] In one aspect, the present invention relates to a bacterium having a genome that is
genetically engineered to be smaller than the genome of its native parent strain. For exemplary
purposes, the work described here has focused on the common laboratory and industrial bacterium
Escherichia coli. The genome reduction work described here began with the laboratory E. coli
strain K12, which had, prior to the work described here, a genome of 4,639,221 nucleotides or base
pairs. The bacterium of the present invention can have a genome that is at least two percent (2%),
preferably over five percent (5%), and as much as 14% to 16%, smaller than the genome of its
native parental strain. We have so far reduced the genome of E. coli K12 by about eight percent
(8%), without impairing the usefulness of the bacteria for protein production. The term "native parental
strain" means a bacterial strain found in a natural or native environment as commonly understood by
the scientific community and on whose genome a series of deletions can be made to generate a
bacterial strain with a smaller genome. The percentage by which a genome has become smaller
after a series of deletions is calculated by dividing "the total number of base pairs deleted after all
of the deletions" by "the total number of base pairs in the genome before all of the deletions" and
then multiplying by 100.
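
As a worked illustration of the calculation just defined, the following minimal Python sketch divides the total number of deleted base pairs by the genome size before deletion and multiplies by 100. The function name and the individual deletion lengths are hypothetical and chosen only for illustration; the genome size of 4,639,221 base pairs is the K12 figure given above.

```python
K12_GENOME_BP = 4_639_221  # E. coli K12 genome size stated in the text


def percent_reduction(deleted_bp, original_genome_bp=K12_GENOME_BP):
    """Return the percentage by which a genome shrinks after a series of deletions.

    deleted_bp: iterable of deletion lengths in base pairs (hypothetical inputs).
    original_genome_bp: genome size before any deletion was made.
    """
    if original_genome_bp <= 0:
        raise ValueError("original genome size must be positive")
    total_deleted = sum(deleted_bp)
    if total_deleted > original_genome_bp:
        raise ValueError("cannot delete more base pairs than the genome contains")
    return 100.0 * total_deleted / original_genome_bp


# Hypothetical deletion lengths, not taken from the tables of this document.
example_deletions = [82_000, 120_000, 169_138]
print(f"{percent_reduction(example_deletions):.2f}% smaller")
```

The denominator is always the size of the genome before the first deletion, so the result does not depend on the order in which the deletions were made.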

[0025] Generally speaking, the types of genes, and other DNA sequences, that can be
deleted are those the deletion of which does not adversely affect the rate of survival and
proliferation of the bacteria under specific growth conditions. Whether a level of adverse effect is
acceptable depends on the specific application. For example, a 30% reduction in proliferation rate
may be acceptable for one application but not another. In addition, the adverse effect of deleting a
DNA sequence from the genome may be reduced by measures such as changing culture
conditions. Such measures may turn an unacceptable adverse effect into an acceptable one.
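
The application-dependent acceptance criterion described above can be sketched as a simple threshold check. This is only an illustrative sketch: it assumes proliferation rate is taken as the reciprocal of doubling time, and the function names, doubling times and thresholds are hypothetical.

```python
def proliferation_reduction(parent_doubling_min, mutant_doubling_min):
    """Fractional drop in proliferation rate, taking rate = 1 / doubling time."""
    if parent_doubling_min <= 0 or mutant_doubling_min <= 0:
        raise ValueError("doubling times must be positive")
    parent_rate = 1.0 / parent_doubling_min
    mutant_rate = 1.0 / mutant_doubling_min
    return (parent_rate - mutant_rate) / parent_rate


def deletion_acceptable(parent_doubling_min, mutant_doubling_min, max_reduction):
    """True if the slowdown stays within the limit chosen for the application."""
    reduction = proliferation_reduction(parent_doubling_min, mutant_doubling_min)
    return reduction <= max_reduction


# Hypothetical measurements: parent doubles in 20 min, deletion strain in 25 min.
print(deletion_acceptable(20, 25, max_reduction=0.30))  # tolerant application
print(deletion_acceptable(20, 25, max_reduction=0.10))  # strict application
```

Changing culture conditions, as mentioned above, would enter this sketch simply as a new measured doubling time for the deletion strain.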

[0026] Below, E. coli is used as an example to illustrate the genes and other DNA
sequences that are candidates for deletion in order to generate a bacterium that can produce a
desired product more efficiently. The general principles illustrated and the types of genes and
other DNA sequences identified as candidates for deletion are applicable to other bacterial species
or strains. It is understood that genes and other DNA sequences identified below as deletion
candidates are only examples. Many other E. coli genes and other DNA sequences not identified
may also be deleted without affecting cell survival and proliferation to an unacceptable level.

[0027] It is assumed in the analysis and methodology described below that the DNA
sequence of the target bacterial strain is available. The full genomic sequence of several strains of
E. coli is, of course, now published (for example, Blattner et al., Science, 277:1453-74, 1997;
Perna et al., Nature, 409:529-533, 2001; Hayashi et al., DNA Res., 8:11-22, 2001), as is the
sequence of several other commonly used laboratory bacteria. To start the deletion process, the
genome of the bacteria is analyzed to look for those sequences that represent good candidates for
deletion. Of course, these techniques can also be applied to partially sequenced genomes in the
genomic areas for which sequence data is available or could be determined.

[0028] In E. coli and other bacteria, as well as in higher organisms, a type of DNA
sequence that can be deleted includes those that in general will adversely affect the stability of the
organism or of the gene products of that organism. Such elements that give rise to instability
include transposable elements, insertion sequences, and other "selfish DNA" elements. For
example, insertion sequence (IS) elements and their associated transposases are often found in
bacterial genomes, and thus are targets for deletion. IS sequences are common in E. coli, and all
of them may be deleted. For purposes of clarity in this document, we use the term IS element
generically to refer to DNA elements, whether intact or defective, that can move from one point to
another in the genome. An example of the detrimental effects of IS elements in science and
technology is the fact that they can hop from the genome of the host E. coli into a BAC plasmid
during propagation for sequencing. Many instances are found in the human genome and other
sequences in the GenBank database. This artifact could be prevented by deletion from the host
cells of all IS elements. For a specific application, other specific genes associated with instability
may also be deleted.

[0029] Shown in Fig. 1 is an illustration of the E. coli genome, which natively, in the K12
strain, comprises 4,639,221 base pairs. Fig. 1 shows, on the inner ring, the scale of the base pair
positions of the E. coli K12 genome (strain MG1655), scaled without deletions. The next ring
progressively outward shows regions of the K12 genome that are missing or highly altered in a
related strain O157:H7, and which are thus potentially deletable from the K12 genome. The next
ring outward shows the positions of the IS elements, both complete and partial, in the native
genome. The next ring moving outward shows the positions of the RHS elements A to E and
flagellar and restriction regions specially targeted for deletion here. The outermost ring shows the
location of the deletions actually made to the genome, as also listed in Tables 1 and 2 below.
These deletions make up about 14 percent of the base pairs in the original K12 MG1655 genome.

[0030] Another family of E. coli genes that can be deleted is the flagella gene family.
Flagella are responsible for motility in bacteria. In natural environments, bacteria swim to search
for nutrients. In cultured environments, bacterial motility is not important for cell survival and
growth and the swimming action is metabolically very expensive, consuming over 1% of the
cellular energy to no benefit. Thus, the flagella genes may be deleted in generating a bacterium
with a smaller genome. Positions of flagella genes on an E. coli genome map are shown in Fig. 1
and Table 1.

[0031] Another family of E. coli genes that can be deleted comprises the restriction modification
system genes and other nuclease genes whose products destroy foreign DNA. These genes are not important
for bacterial survival and growth in culture environments. These genes can also interfere with
genetic engineering by destroying plasmids introduced into a bacterium. Thus, these genes can be
deleted in generating a bacterium with a smaller genome. Positions of restriction modification
system genes on an E. coli genome map are shown in Fig. 1 and Table 1.



[0032] One type of E. coli DNA element, already mentioned, that can be deleted is the IS
elements. IS elements are not important for bacterial survival and growth in a cultured
environment and are known to interfere with genome stability. Thus, the IS elements can be 
deleted in generating a bacterium with a smaller genome. Positions of the IS elements on an E. 
coli genome map are shown in Fig. 1 and Table 1. 

[0033] Another type of E. coli DNA element that can be deleted is the Rhs elements. All
Rhs elements share a 3.7 kb Rhs core, which is a large homologous repeated region (there are 5
copies in E. coli K-12) that provides a means for genome rearrangement via homologous
recombination. The Rhs elements are accessory elements which largely evolved in some other
background and spread to E. coli by horizontal exchange after divergence of E. coli as a species.
Positions of the Rhs elements on an E. coli genome map are shown in Fig. 1 and Table 1.

[0034] One type of region in the E. coli genome that can be deleted is the non-transcribed
regions because they are less likely to be important for cell survival and proliferation. Another
type of region in the E. coli genome that can be deleted is the hsd regions. The hsd regions
encode the major restriction modification gene family which has been discussed above.
Positions of the non-transcribed regions and the hsd regions on an E. coli genome map are shown
in Fig. 1 and Table 1.

[0035] One general method to identify additional genes and DNA sequences as deletion
candidates is to compare the genome of one bacterial strain to another. Any DNA sequences that
are not present in both strains are less likely to be functionally essential and thus can be used for
identifying candidates for deletion. In the examples described below, the complete genomic
sequences of two E. coli strains, O157:H7 EDL933 and K-12 MG1655, were compared. DNA
sequences that were not found in both strains were used to identify targets for deletion. Twelve
such identified targets from E. coli strain MG1655 were deleted, resulting in a bacterial strain with
a genome that is about 8% smaller. The bacteria with the reduced genome are alive and grow at
substantially the same rate as the native parent MG1655 strain.
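
The comparison step can be pictured as a set difference. The sketch below is a simplified illustration only: it assumes each genome has already been reduced to a set of segment or gene identifiers by some prior alignment step, and every identifier and function name shown is hypothetical.

```python
def deletion_candidates(target_segments, reference_segments):
    """Segments present in the target strain but absent from the reference strain.

    Both arguments are iterables of segment identifiers. How the identifiers are
    derived (for example, by whole-genome alignment) is outside this sketch.
    """
    return sorted(set(target_segments) - set(reference_segments))


# Hypothetical identifiers standing in for the MG1655 and O157:H7 EDL933 content.
mg1655 = ["coreA", "coreB", "k12_island_1", "k12_island_2"]
edl933 = ["coreA", "coreB", "o157_island_1"]

print(deletion_candidates(mg1655, edl933))
```

Segments shared by both strains are retained, reflecting the reasoning above that sequences conserved between strains are more likely to be functionally essential.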

[0036] One can test the consequence of deleting one or several genes or other DNA
sequences from the genome. For example, after one or several genes or other DNA sequences of
the genome have been deleted, one can measure the survival and proliferation rate of the resultant
bacteria. Although most of the above-identified genes or other DNA sequences may be deleted
without detrimental effect for the purpose of producing a desired product, it is possible that the
deletion of a specific gene or other DNA sequence may have an unacceptable consequence such as
cell death or an unacceptable level of reduction in proliferation rate. This possibility exists because
of redundancies in gene functions and interactions between biological pathways. Some deletions
that are viable in a strain without additional deletions will be deleterious only in combination with
other deletions. The possibility exists also because of certain methods used to identify deletion
candidates. For example, one method used to identify deletion candidates is to compare two E.
coli strains and select genes or other DNA sequences that are not present in both strains. While
the majority of these genes and other DNA sequences are not likely to be functionally essential,
some of them may be important for a unique strain. Another method used to identify deletion
candidates is to identify non-transcribed regions, and the possibility exists that certain
non-transcribed regions may be important for genome stability.

[0037] The consequence of deleting one or several genes or other DNA sequences to be
tested depends on the purpose of an application. For example, when high production efficiency is
the main concern, which is true for many applications, the effect of deletions on proliferation rate
and medium consumption rate can be the consequence tested. In this case, the consequence tested
can also be more specific, such as the production speed and quantity of a particular product. When
eliminating native protein contamination is the main concern, fewer native proteins and lower
native protein levels, or the absence of a specific native protein, can be the consequence tested.

[0038] Testing the consequence of deleting a gene or other DNA sequence is important
when little is known about the gene or the DNA sequence. Though laborious, this is another
viable method to identify deletion candidates in making a bacterium with a reduced genome. This 
method is particularly useful when candidates identified by other methods have been deleted and 
additional candidates are being sought. 

[0039] When deleting a gene or other DNA sequence affects
the viability of the bacteria under a set of conditions, one alternative to not deleting the specific
gene or other DNA sequence is to determine if there are measures that can mitigate the detrimental
effects. For example, if deleting lipopolysaccharide (LPS) genes results in poor survival due to
more porous cellular membranes caused by the absence from the cellular membranes of the
transmembrane domain of the LPS proteins, culture conditions can be changed to accommodate
the more porous cellular membranes so that the bacteria lacking the LPS genes can survive just as
well as the bacteria carrying the LPS genes.

[0040] Methods for deleting DNA sequences from bacterial genomes that are known to one
of ordinary skill in the art can be used to generate a bacterium with a reduced genome. Examples
of these methods include but are not limited to those described in Posfai, G. et al., J. Bacteriol.
179: 4426-4428 (1997), Muyrers, J.P.P. et al., Nucl. Acids Res. 27:1555-1557 (1999), Datsenko,
K.A. et al., Proc. Natl. Acad. Sci. 97:6640-6649 (2000) and Posfai, G. et al., Nucl. Acids Res. 27:
4409-4415 (1999), all of which are hereby incorporated by reference in their entirety. Basically,
the deletion methods can be classified into those that are based on linear DNAs and those that are
based on suicide plasmids. The methods disclosed in Muyrers, J.P.P. et al., Nucl. Acids Res. 
27:1555-1557 (1999) and Datsenko, K.A. et al., Proc. Natl. Acad. Sci. 97:6640-6649 (2000) are 
linear DNA-based methods and the methods disclosed in Posfai, G. et al., J. Bacteriol. 179: 4426- 
4428 (1997) and Posfai, G. et al., Nucl. Acids Res. 27: 4409-4415 (1999) are suicide plasmid- 
based methods. 

[0041] Some known methods for deleting DNA sequences from bacterial genomes
introduce extraneous DNA sequences into the genome during the deletion process and thus create
a potential problem of undesired homologous recombination if any of the methods is used more
than once in a bacterium. To avoid this problem, scarless deletion methods are preferred. By
scarless deletion, we mean a DNA sequence is precisely deleted from the genome without
generating any other mutations at the deletion sites and without leaving any inserted DNA in the
genome of the organism. However, due to mistakes, such as those made in PCR amplification and
DNA repair processes, one or two nucleotide changes may be introduced occasionally in
scarless deletions. Described below are some novel scarless deletion methods, either linear DNA- 
based or suicide plasmid-based. These novel methods have been applied to E. coli strains in the 
examples described below. It is understood that the specific vectors and conditions used for E. 
coli strains in the examples can be adapted by one of ordinary skill in the art for use in other 
bacteria. Similar methods and plasmids can be used to similar effect in higher organisms. In 
some instances it may be more appropriate to modify an existing production strain rather than 
transfer production to the minimized genome E. coli strain. 

Novel linear DNA-based scarless deletion method I



[0042] The novel DNA-based scarless deletion method of the present invention can be best
understood when the following description is read in view of Fig. 2. Generally speaking, the
method involves replacing a segment of the genome, marked for deletion, with an artificial DNA
sequence. The artificial sequence contains one or more recognition sites for a sequence-specific
nuclease such as I-SceI, which cuts at a sequence that does not occur natively anywhere in the E.
coli K12 genome. Precise insertion of the linear DNA molecule into the genome is achieved by
homologous recombination aided by a system that can increase the frequency of homologous
recombination. When the sequence-specific nuclease is introduced into the bacteria, it cleaves the
genomic DNA at the unique recognition site or sites, and only those bacteria in which a
homologous recombination event has occurred will survive.
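
The requirement that the nuclease recognition site be absent from the native genome can be checked computationally before a construct is designed. The following Python sketch scans both strands of a sequence for a recognition site. The function names are hypothetical, and the 18-base-pair sequence used as the default is the commonly cited I-SceI site, stated here as an assumption that should be confirmed against the enzyme supplier's documentation.

```python
# Commonly cited 18-bp I-SceI recognition site (assumption; verify before use).
I_SCEI_SITE = "TAGGGATAACAGGGTAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(genome, site=I_SCEI_SITE):
    """Count occurrences of `site` on either strand, including overlapping hits."""
    genome = genome.upper()
    site = site.upper()
    total = 0
    # A set avoids double counting if the site happens to be palindromic.
    for pattern in {site, reverse_complement(site)}:
        start = genome.find(pattern)
        while start != -1:
            total += 1
            start = genome.find(pattern, start + 1)
    return total


def site_is_absent(genome, site=I_SCEI_SITE):
    """True if the nuclease would find no target in the unmodified genome."""
    return count_sites(genome, site) == 0


# Toy sequences for illustration; a real check would load the full genome sequence.
print(site_is_absent("ACGT" * 10))
print(count_sites("AAAA" + I_SCEI_SITE + "CCCC"))
```

After the artificial sequence has been inserted, the same count applied to the modified genome should equal the number of sites carried by the insert.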

[0043] Referring specifically to Fig. 2, the plasmid pSG76-CS is used as a template to
synthesize the artificial DNA insert. The artificial insertion sequence extends between the
sequences designated A, B and C in Fig. 2. The CR indicates a gene for antibiotic resistance. The
insert DNA is PCR amplified from the plasmid and electroporated into the E. coli host. The insert
was constructed so that the sequences A and B match sequences in the genome of the host which
straddle the proposed deletion. Sequence C of the insert matches a sequence in the host genome
just inside sequence B of the host genome. Then the bacteria are selected for antibiotic resistance,
a selection which will be survived only by those bacteria in which a homologous recombination
event occurred in which the artificial DNA was inserted into the bacterial genome. This recombination
event occurs between the pairs of sequences A and C. The inserted DNA sequence also includes a
sequence B, now positioned at one end of the insert, which is designed to be homologous to a
sequence in the genome just outside the other end of the insert, as indicated in Fig. 2. Then, after
growth of the bacteria, the bacteria are transformed with a plasmid, pSTKST, which expresses the
I-SceI sequence-specific nuclease. The I-SceI enzyme cuts the genome of the bacteria, and only
those individuals in which a recombination event occurs will survive. 10-100% of the survivors
are B to B recombination survivors, which can be identified by a screening step. The B to B
recombination event deletes the entire inserted DNA from the genome, leaving nothing behind but
the native sequence surrounding the deletion.
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[0044] To repeat, the first step of the method involves providing a linear DNA molecule in 

a bacterium. The linear DNA molecule contains an artificial linear DNA sequence that has the 
following features: one end of the linear DNA sequence is a sequence identical to a genome 
sequence on the left flank of the genome region to be deleted, followed by a sequence identical to 
a genome sequence on the right flank of the genome region to be deleted; the other end of the 
linear DNA molecule is a sequence identical to a genome sequence within the genome region to 
be deleted; between the two ends of the linear DNA, there is a recognition site that is not present 
in the genome of the bacterial strain and an antibiotic selection gene. The artificial DNA sequence 
can be made using polymerase chain reaction (PCR) or directed DNA synthesis. A PCR template 
for this purpose contains the unique recognition site and the genomic DNA sequences on both 
ends of the artificial linear DNA sequence are part of the primers used in the PCR reaction. The 
PCR template can be provided by a plasmid. An example of a plasmid that can be used as a 
template is pSG76-C (GenBank Accession No. Y09893), which is described in Posfai, G. et al., J. 
Bacteriol. 179: 4426-4428 (1997). pSG76-CS (GenBank Accession No. AF402780), which is 
derived from pSG76-C, may also be used. pSG76-CS contains the chloramphenicol resistance 
(CmR) gene and two I-SceI sites, and was obtained by the PCR-mediated insertion of a second 
I-SceI recognition site into pSG76-C, downstream of the NotI site. The two I-SceI sites are in 
opposite orientations. 

[0045] An artificial or constructed DNA sequence can be provided to a bacterium by 
directly introducing the linear DNA molecule into the bacterium using any method known to one 
of ordinary skill in the art, such as electroporation. In this case, a selection marker such as an 
antibiotic resistance gene is engineered into the artificial DNA sequence for the purpose of 
selecting colonies containing the inserted DNA sequence later. Alternatively, a linear DNA molecule 
can be provided in a bacterium by transforming the bacterium with a vector carrying the artificial 
linear DNA sequence and generating a linear DNA molecule inside the bacterium through 
restriction enzyme cleavage. The restriction enzyme used should cut only the vector and not 
the bacterial genome. In this case, the artificial linear DNA sequence does not have to carry a 
selection marker, because the transformation efficiency of a vector is high enough that a bacterium 
with the inserted linear DNA can later be identified directly by PCR screening. 
[0046] The second step of the scarless deletion method involves replacement of a genomic 
region by insertion of the artificial DNA molecule. The bacterial cells are engineered to contain a 
system that increases the frequency of homologous recombination. An example of such a system 
is the Red recombinase system. The system can be introduced into bacterial cells by a vector. 
The system helps the linear DNA molecule to replace a genomic region which contains the 
deletion target. As described in the examples below, a vector carrying a homologous 
recombination system that can be used in E. coli is pBADαβγ, which is described in Muyrers, 
J.P.P. et al., Nucl. Acids Res. 27:1555-1557 (1999). Another plasmid, pKD46, described in 
Datsenko, K.A. et al., Proc. Natl. Acad. Sci. 97:6640-6649 (2000), may also be used. Other 
plasmids that can be used include pGPXX and pJGXX. pGPXX is derived from pBADαβγ by 
replacing the origin of replication in pBADαβγ with the pSC101 origin of replication. pJGXX is a 
pSC101 plasmid that encodes the Red functions from phage 933W under tet promoter control. 

[0047] The third step of the scarless deletion method involves removal of the inserted DNA 
sequence. An expression vector for a sequence-specific nuclease such as I-SceI that recognizes 
the unique recognition site on the inserted DNA sequence is introduced into the bacteria. The 
sequence-specific nuclease is then expressed and the bacterial genome is cleaved. After the 
cleavage, only those cells in which homologous recombination occurs resulting in a deletion of the 
inserted linear DNA molecule can survive. Thus, bacteria with a target DNA sequence deleted 
from the genome are obtained. Examples of sequence-specific nuclease expression vectors that 
can be used in E. coli include pKSUC1, pKSUC5, pSTKST, pSTAST, pKTSHa, pKTSHc, 
pBADSce1 and pBADSce2. The sequence-specific nuclease carried by these vectors is I-SceI. 
pKSUC1, pKSUC5, pSTKST and pSTAST are described below in the examples. 
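
The three steps above can be followed on a toy sequence. The sketch below is only an illustration under stated assumptions: the genome, the flank sequences A and B, the internal sequence C and the marker placeholder are all made-up strings, recombination is modeled as exact string replacement, and the function names are hypothetical. The 18-bp string used for the I-SceI site is the commonly cited recognition sequence and is not taken from this document.

```python
# Toy model of the scarless deletion: insert A+B+marker+C, then excise between the two B copies.
# All sequences are invented for illustration; lowercase = filler, uppercase = named segments.

SCEI_SITE = "TAGGGATAACAGGGTAAT"      # commonly cited I-SceI recognition sequence (assumption)
MARKER = SCEI_SITE + "NNNNNNNN"       # placeholder standing in for the resistance gene

LEFT, RIGHT = "ccgatcgatt", "gatcgatcaa"   # genome outside the region of interest
A = "ACGTACGTAA"                      # left flank of the region to be deleted
B = "GGCCTTAAGG"                      # right flank of the region to be deleted
C = "TTGACCATGC"                      # sequence inside the region to be deleted
REGION = "ttttggggcccc" + C + "aaaacccc"


def build_cassette(a: str, b: str, marker: str, c: str) -> str:
    """Linear fragment: left flank, right flank, marker with nuclease site, internal sequence."""
    return a + b + marker + c


def insert_cassette(genome: str, cassette: str, a: str, c: str) -> str:
    """Step 2: recombination at A and at C replaces the genomic span A..C with the cassette."""
    start = genome.index(a)
    end = genome.index(c, start) + len(c)
    return genome[:start] + cassette + genome[end:]


def excise_between_repeats(genome: str, b: str) -> str:
    """Step 3: B-to-B recombination removes one B copy and everything between the copies."""
    first = genome.index(b)
    second = genome.index(b, first + len(b))
    return genome[:first] + genome[second:]


def run_demo() -> str:
    genome = LEFT + A + REGION + B + RIGHT
    inserted = insert_cassette(genome, build_cassette(A, B, MARKER, C), A, C)
    return excise_between_repeats(inserted, B)
```

In this model the final string is the original genome with the region between A and B removed and no marker or nuclease site left behind, which is the sense in which the deletion is scarless.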

[0048] The method described above can be used repeatedly in a bacterium to generate a 
series of deletions. When the expression vector for the homologous recombination system and the 
expression vector for the unique sequence-specific nuclease are not compatible with each other, 
as is the case for pBADαβγ and pKSUC1, transformation of the two vectors has to be 
performed for each deletion cycle. Transformation of the two vectors can be avoided in additional 
deletion cycles when two compatible plasmids, such as pBADαβγ and pSTKST, or pKD46 and 
pKSUC5, are used. An example of using two of these vectors that are compatible with each other 
is described in the examples below. 
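
The compatibility rule in this paragraph can be written as a small lookup. This is a sketch only: it encodes just the three plasmid pairs named above, the function name is hypothetical, and pairs not mentioned in the text are reported as unknown rather than guessed.

```python
# Plasmid pairs taken from the paragraph above; "pBAD-abg" is an ASCII spelling
# of the alpha-beta-gamma Red recombinase plasmid.
INCOMPATIBLE_PAIRS = {frozenset({"pBAD-abg", "pKSUC1"})}
COMPATIBLE_PAIRS = {
    frozenset({"pBAD-abg", "pSTKST"}),
    frozenset({"pKD46", "pKSUC5"}),
}


def retransform_each_cycle(recombinase_plasmid: str, nuclease_plasmid: str) -> bool:
    """Return True if both plasmids must be transformed again in every deletion cycle."""
    pair = frozenset({recombinase_plasmid, nuclease_plasmid})
    if pair in INCOMPATIBLE_PAIRS:
        return True
    if pair in COMPATIBLE_PAIRS:
        return False
    raise KeyError("compatibility of this pair is not stated in the text")
```

For example, `retransform_each_cycle("pBAD-abg", "pSTKST")` returns `False`, matching the statement that compatible plasmids avoid repeated transformation.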
[0049] The above scarless deletion method can be modified to make a series of deletions on 
a bacterial genome more efficient (an example of which is Procedure 4 in Examples below). The 
first step of the modified method involves making insertions of a linear DNA molecule 
individually in bacterial cells, preferably wild-type bacteria cells, in a parallel fashion, resulting in 
a set of strains, each carrying a single insertion. This step can be carried out as described above. 
The second step of the modified method involves sequentially transferring individual insertions 
into the target cell whose genome is to be reduced. P1 transduction is an example of the methods 
that can be used for transferring insertions. The third step of the modified method involves 
recombinational removal of the inserted sequence, which can be carried out as described above. 

Novel linear DNA-based scarless deletion method II 

[0050] In this novel linear DNA-based method, two DNA sequences, one of which is 
identical to a sequence that flanks one end of a bacterial genome region to be deleted and the other 
of which is identical to a sequence that flanks the other end of the bacterial genome region and 
oriented similarly, are engineered into a plasmid vector. The vector is herein termed the target 
vector. The two DNA sequences are located next to each other on the target vector. At least one 
recognition site for an enzyme that will only cut the target vector but not the bacterial genome is 
also engineered into the target vector at a location outside the two DNA sequences. The 
recognition site can be one for a sequence-specific nuclease such as I-SceI. The recognition site 
can also be one for a methylation-sensitive restriction enzyme that only cuts an unmethylated 
sequence. Since the recognition site, if there is any, on the bacterial genome is methylated, the 
restriction enzyme can only cut the target vector. The target vector is transformed into a 
bacterium and a linear DNA molecule is generated inside the bacterium by expressing in the 
bacterium the enzyme that recognizes and cuts the recognition site on the target vector. Next, a 
system that can increase homologous recombination is activated inside the bacterium to induce 
homologous recombination between the homologous sequences of the linear DNA and the 
bacterial genome that flank the region to be deleted. A bacterium with a targeted genome region 
deleted can be obtained as a result of the above homologous recombination. 

[0051] This novel linear DNA-based method can also be used to replace a region of a 
bacterial genome with a desired DNA sequence. In this case, a desired DNA sequence that can 
undergo homologous recombination with the bacterial genome to replace a region on the genome 
is engineered into the target vector. All other aspects are the same as described above for deleting 
a region of the bacterial genome. 

[0052] Regardless of whether the method is used to delete or replace a target region in the 
bacterial genome, a marker gene for selecting incorporation of DNA carried on the target vector 
into the bacterial genome is not necessary due to the high incorporation efficiency. Simply 
screening 30-100 colonies by PCR usually allows the identification of a clone with the desired 
modification in the bacterial genome. 

[0053] As a specific example, Figs. 3 and 4 illustrate using this method for introducing an 
Amber stop codon in the middle of a gene. As a first step, a DNA fragment with the desired 
modifications located near the middle of the gene or chromosomal region is produced. A 
recognition site for the sequence-specific nuclease I-SceI is introduced at one side of the DNA 
fragment. This can be easily accomplished by including the sequence in the 5' end of PCR primers 
used to amplify the DNA fragment. Longer DNA fragments (500-5,000 nucleotides) generally work 
the best. 

[0054] The DNA fragment is cloned into a multi-copy target plasmid vector such as pUC19 
(GenBank accession No. M77789). Since this target vector is used along with a mutagenesis 
vector as described below, the target vector is engineered to be compatible with p15A origin 
plasmids (pACYC184-derived; GenBank accession No. X06403) and has a drug resistance marker 
other than chloramphenicol. These restrictions can be easily avoided by using an alternate 
mutagenesis plasmid. 

[0055] As illustrated in Fig. 4, the mutagenesis plasmid used in this example contains the 
sequence-specific nuclease I-SceI and the lambda red genes exo, beta and gam under control of 
the P-BAD promoter. The plasmid also contains the p15A ori and a chloramphenicol resistance gene. 

[0056] The target and the mutagenesis plasmids are transformed into a recA positive E. coli 
strain. The bacteria are selected for resistance to chloramphenicol and the resistance carried on 
the target plasmid. A single colony is then picked and cultured at 37°C for about 7.0 hours in 1 ml 
of Rich Defined Media (Neidhardt et al., J. Bacteriol. 119:736-47, which is hereby incorporated by 
reference in its entirety) containing 0.2% arabinose and chloramphenicol. A series of dilutions 
(for example, 1:1,000, 1:10,000 and so on) of cultures is then plated on a non-selective medium 
such as LB. Next, the colonies are screened for desired mutations. If a growth phenotype is 
known, the screening can be done by patching on appropriate media. Otherwise, the screening is 
done with colony PCR followed by restriction digestion and electrophoresis or by sequencing. 

Suicide plasmid-based method 

[0057] The suicide plasmid-based method described here can be used for both scarless gene 
deletion and gene replacement. The basic element of the method involves a plasmid vector named 
Interlock plasmid that contains an antibiotic resistance gene and a replication origin under the 
control of a promoter. The Interlock plasmid also contains one or more sites at which a DNA 
insert can be inserted. When the method is used for scarless deletion, the DNA insert includes two 
DNA sequences located right next to each other, oriented similarly, one of which is identical to a 
sequence that flanks one end of a bacterial genome region to be deleted and the other of which is 
identical to a sequence that flanks the other end of the bacterial genome region. When the method 
is used for gene replacement, the DNA insert includes a sequence that will replace a segment of a 
bacterial genome. When the promoter that controls the origin of replication is turned off, the 
replication of the plasmid is shut down and the antibiotic pressure can be used to select for 
chromosomal integrations. After chromosomal integration, the promoter that controls the 
replication origin from the plasmid can be turned on and the only bacteria that can survive are 
those in which a recombination event has occurred to eliminate said origin of replication, its 
promoter or both. When the DNA insert is for making a scarless deletion, the majority of the 
recombination events will result in bacteria that have either the desired scarless deletion or the 
same genome as before any integration. When the DNA insert is for gene replacement, the majority 
of the recombination events will result in bacteria that have either the desired replacement or the 
same genome as before any integration. A screening step can then be performed to identify those 
bacteria with the desired modifications in the genome. 

[0058] A variation of the above method involves the same Interlock plasmid except that the 
plasmid also contains a sequence-specific nuclease recognition site that is absent in the bacterial 
genome. After chromosomal integration, instead of activating the origin of replication control 
promoter to select for recombination events, the bacteria are engineered to express the sequence- 
specific nuclease to cut the bacterial genome and select for recombination events. 
[0059] The suicide plasmid-based method can also be used repeatedly in a similar fashion 
as the novel linear DNA-based methods described above to generate a series of deletions on a 
bacterial genome. 

[0060] Fig. 6 shows plasmid embodiments that can be used in the suicide plasmid-based 
method. pIL1 is an Interlock plasmid and pBAD-Sce-1 is a plasmid for expressing the 
sequence-specific nuclease I-SceI. pIL4 is a combination of both. The tet promoter used in pIL1 and 
pIL4 is tightly regulated and thus has advantages over other control mechanisms such as a 
temperature-sensitive element, which is leakier. An example of using pIL4 for gene replacement is 
shown in Fig. 5A-C to illustrate the suicide plasmid-based method of the present invention. Fig. 5A 
shows the insertion of a DNA insert into pIL4 and the integration of pIL4 into the bacterial 
genome. With heat-activated chlortetracycline (CTC), the tet repressor is inactive, the O and P 
promoter is functional, and the plasmid replicates. After removing CTC, the tet repressor binds the 
O and P promoter and the replication is blocked. Chloramphenicol resistance can be used to select 
for integrants. Fig. 5B shows using the induction of the ectopic origin to select for homologous 
recombination and two possible outcomes of the homologous recombination. Fig. 5C shows the 
alternative way of selecting for homologous recombination and the two possible outcomes of the 
recombination. This alternative way involves inducing I-SceI expression to generate a 
double-strand break. 

[0061] Two specific embodiments of the suicide plasmid-based method are described 
below as protocol 1 and protocol 2. Either pIL1 or pIL4 can be used for protocol 1, and pIL1 in 
combination with pBAD-Sce-1 can be used for protocol 2. One of ordinary skill in the art can 
also adapt protocol 2 for using pIL4 alone. 

Protocol 1 (Counterselection with lambda origin): 

1. Generate the desired genomic modification as a linear DNA fragment. In the case of making 
an Amber mutant, the modification can be made by megaprimer PCR. To make a deletion in 
the genome, a fusion of the desired endpoints of the deletion should be used. The ends of the 
DNA fragment should be phosphorylated for cloning. 

2. Create a blunt cloning site by digesting the pIL4 vector (Figs. 5A and 6) with the restriction 
enzyme SrfI. Dephosphorylate the vector. 

3. Perform a blunt ligation of the desired modification and the pIL4 vector. 

4. (Note: this step is potentially dispensable in high throughput implementation.) Transform the 
ligation into a cloning strain of E. coli (such as JS5). Outgrow the transformation for 1 hour 
in LB + 1 µg/ml cTc (cTc = chlortetracycline freshly autoclaved in LB media. A stock of 100 
µg/ml is autoclaved for 20 minutes and then stored in the dark at 4°C. It can be used for up to 
5 days. Alternately, a solution of 2 ng/ml of anhydrotetracycline can be substituted). Then 
plate on LB + Chloramphenicol (Cam 25 µg/ml) + cTc (1 µg/ml), and grow overnight at 
37°C. Grow colonies in equivalent media and prepare plasmid miniprep DNA. Analyze by 
gel electrophoresis and select a clone with an insert. 

5. Transform the verified plasmid into a recA positive strain of E. coli (such as MG1655). 
Outgrow for 1 hour in LB + 1 µg/ml cTc. Plate a portion of the outgrowth on plates 
containing Cam and 1 µg/ml cTc. Grow overnight at 37°C. 

6. Pick a colony into 1 ml LB and plate 10 µl on a Cam plate. Grow overnight at 37°C. 

7. Streak a colony on a Cam plate to be sure that every cell present contains the integrated 
plasmid. Grow overnight at 37°C. 

8. Pick a colony into 1 ml LB and plate 100 µl of a 1:100 dilution on plates containing 5 µg/ml 
cTc. Grow overnight at 37°C. 

9. (Screen for mutant) Only a fraction of the counterselected colonies will contain the desired 
modification and the others will be reversions to wild type. The proportion of mutant to revertant 
will depend on the location of the modification in the cloned fragment. A screen 
must be performed to identify the desired mutant. For the production of Amber mutants, the 
gene in question can be amplified by PCR and digested with BfaI restriction enzyme (BfaI 
cuts Amber codons that are preceded by a 'C'). 
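
The BfaI screen in step 9 can be previewed in silico. The following is a minimal sketch, assuming BfaI recognizes CTAG and cuts after the first C (so an Amber codon TAG preceded by a C creates a site); the two example sequences and the function name are invented for illustration and are not from this document.

```python
BFAI_SITE = "CTAG"   # assumed BfaI recognition sequence; cut falls between C and TAG


def bfai_fragment_lengths(amplicon: str) -> list:
    """Return the fragment lengths expected after complete BfaI digestion of a linear amplicon."""
    seq = amplicon.upper()
    cut_positions = []
    pos = seq.find(BFAI_SITE)
    while pos != -1:
        cut_positions.append(pos + 1)          # cut after the C of CTAG
        pos = seq.find(BFAI_SITE, pos + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Hypothetical 18-nt amplicons: the mutant changes codon CAG to the Amber codon TAG.
WILD_TYPE = "ATGGCCCAGGGTAAATAA"
AMBER_MUTANT = "ATGGCCTAGGGTAAATAA"
```

With these toy sequences the wild-type amplicon stays uncut, while the Amber mutant gains one site and yields two fragments. On a gel, that extra cut is what distinguishes mutant from revertant colonies.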

Protocol 2 (High-throughput counterselection with I-SceI): 

1-4. Same as protocol 1. 

5. Co-transform the insert-carrying Interlock plasmid and pBAD-Sce-1 into a recA positive 
strain of E. coli (such as MG1655). Outgrow for 1 hour in LB + 1 µg/ml cTc. 
(Alternatively, the insert-carrying Interlock plasmid can be transformed on its own into 
competent cells already carrying pBAD-Sce-1.) 

6. Add Chloramphenicol to 25 µg/ml and Kanamycin to 50 µg/ml. Grow for 1-2 hours at 
37°C with shaking. 

7. Pellet the cells in a microcentrifuge for 30 seconds. Remove the media supernatant. 

8. (Integration step) Resuspend the cells in 1 ml LB + Chloramphenicol (25 µg/ml) + 
Kanamycin (50 µg/ml) + Glucose (0.2%) and grow overnight at 37°C, shaking. 

9. Dilute the overnight culture 1:10,000 in the same media and grow an additional 16-24 
hours at 37°C. 

10. (Counterselection step) Dilute 10 µl of the culture into 1 ml 1x M9 minimal salts (to 
minimize growth rate). Split this into two tubes of 0.5 ml each. To one add Arabinose to 
0.2% and to the other add Glucose to 0.2% (to serve as a negative control). Grow 1-2 
hours at 37°C with shaking. 

11. Plate 10 µl of the Arabinose tube onto LB + Kanamycin (50 µg/ml) + Arabinose (0.2%) 
and 10 µl of the Glucose tube onto LB + Chloramphenicol (25 µg/ml) + Kanamycin (50 
µg/ml) + Glucose (0.2%). Grow overnight at 37°C. 

12. (Screen for mutant) Perform step 9 of protocol 1. 
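
Both protocols specify additives by final concentration. The volume of stock to add follows from C1·V1 = C2·V2, sketched below. The 100 µg/ml cTc stock is the one given in protocol 1; the 25 mg/ml chloramphenicol stock in the usage note is a hypothetical value chosen only for the example, and the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ul(final_conc_ug_per_ml: float,
                    stock_conc_ug_per_ml: float,
                    final_volume_ml: float) -> float:
    """Microliters of stock needed to reach the final concentration (C1*V1 = C2*V2).

    The volume added by the stock is neglected, which is a reasonable
    approximation only when the stock is much more concentrated than the target.
    """
    if stock_conc_ug_per_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("stock concentration and final volume must be positive")
    if final_conc_ug_per_ml > stock_conc_ug_per_ml:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc_ug_per_ml * final_volume_ml * 1000.0 / stock_conc_ug_per_ml
```

For instance, `stock_volume_ul(1, 100, 1)` gives the volume of the 100 µg/ml cTc stock for 1 ml of outgrowth medium at 1 µg/ml, and `stock_volume_ul(25, 25000, 1)` does the same for chloramphenicol under the assumed 25 mg/ml stock.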

[0062] The above disclosure generally describes the present invention. The invention will 
be more fully understood upon consideration of the following examples which are provided herein 
for purposes of illustration only and are not intended to limit the scope of the invention. 



EXAMPLES 

Plasmids 

[0063] The plasmid used for PCR construction of the artificial inserted DNA sequence was 
designated pSG76-CS (GenBank Accession No. AF402780), which was derived from pSG76-C 
(Posfai, G. et al., J. Bacteriol. 179: 4426-4428 (1997)) by inserting a second I-SceI site. The 
second I-SceI recognition site was introduced by PCR-mediated insertion into pSG76-C, 
downstream of the NotI site. The two I-SceI sites are in opposite orientations. 
[0064] The pBADαβγ plasmid was used for enhancing recombination of linear DNA 
fragments into the genome. This plasmid was described in Muyrers, J.P.P. et al., Nucl. Acids Res. 
27:1555-1557 (1999). 

[0065] The pKSUC1 plasmid (GenBank Accession No. AF402779), for expressing I-SceI, 
was derived from pSG76-K (Posfai, G. et al., J. Bacteriol. 179: 4426-4428 (1997)) and 
pUC19RP12 (Posfai, G. et al., Nucl. Acids Res. 27: 4409-4415 (1999)). The XbaI-NotI fragment 
(carries the Kan gene; the NotI end was blunted by Klenow polymerase) of pSG76-K was ligated 
to the XbaI-DraI fragment (carries the I-SceI gene and the pUC ori) of pUC19RP12. 

[0066] The pKSUC5 plasmid for tetracycline-regulated expression of I-SceI was derived 
from pFT-K (Posfai, G. et al., J. Bacteriol. 179: 4426-4428 (1997)) and pKSUC1. The large 
XbaI-NcoI fragment of pKSUC1 was ligated to the XbaI-NcoI fragment of pFT-K carrying the tet 
repressor. 

[0067] The pKD46 plasmid for enhancing recombination of linear DNA fragments into the 
genome was described in Datsenko, K.A. et al., Proc. Natl. Acad. Sci. 97:6640-6649 (2000). 

[0068] The plasmid pSTKST (GenBank Accession No. AF406953) is a low copy number 
KanR plasmid for chlortetracycline-regulated expression of I-SceI, derived from pFT-K (Posfai, G. 
et al., J. Bacteriol. 179: 4426-4428 (1997)) and pUC19RP12 (Posfai, G. et al., Nucl. Acids Res. 
27: 4409-4415 (1999)). The XbaI-PstI fragment from pUC19RP12, carrying the I-SceI gene, was 
ligated to the large XbaI-PstI fragment of pFT-K. This plasmid expresses I-SceI when induced by 
chlortetracycline. Replication of the plasmid is temperature-sensitive (Posfai, G. et al., J. 
Bacteriol. 179: 4426-4428 (1997)). 

[0069] The plasmid pSTAST, a low copy number ApR plasmid for chlortetracycline-regulated 
expression of I-SceI, was derived from pFT-A (Posfai, G. et al., J. Bacteriol. 179: 4426-4428 
(1997)) and pUC19RP12 (Posfai, G. et al., Nucl. Acids Res. 27: 4409-4415 (1999)). The 
XbaI-PstI fragment from pUC19RP12, carrying the I-SceI gene, was ligated to the large XbaI-PstI 
fragment of pFT-A. This plasmid expresses I-SceI when induced by chlortetracycline. Replication 
of the plasmid is temperature-sensitive (Posfai, G. et al., J. Bacteriol. 179: 4426-4428 (1997)). 
Procedure 1 

[0070] This describes the process used to repeatedly make deletions from the genome of E. 
coli K12. This procedure is a scarless deletion method. The procedure begins with the 
construction of a linear target fragment by PCR. This was done by mixing 20 pmol of primer A 
with 20 pmol of primer B, and performing PCR in a total volume of 50 µl. The cycle parameters 
used were 15x(94°C 40sec/57°C or lower (depending on the overlap of A and B) 40sec/72°C 
15sec). Then 1 µl of the PCR mix above was taken and added to 20 pmol each of primers A and C, 
50 ng of pSG76-CS was added, and PCR was performed in a volume of 2x50 µl (two 50-µl tubes 
were used and then combined to have more DNA). The cycle parameters used were 28x(94°C 
40sec/57°C 40sec/72°C 80sec). To purify the PCR mix from the above step, the Promega Wizard 
PCR purification kit was used. The resulting DNA fragment was suspended in 20 µl of water. 
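
The two cycling programs above can be captured as data to estimate block time. This is a sketch under stated assumptions: only the hold times given in the text are counted, ramping between temperatures and any initial denaturation or final extension are ignored, and the class and constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """One three-step cycling program; times are hold times in seconds."""
    cycles: int
    denature_s: int
    anneal_s: int
    extend_s: int

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding ramp time between steps."""
        return self.cycles * (self.denature_s + self.anneal_s + self.extend_s)


# Round 1: primers A and B only; round 2: primers A and C on the pSG76-CS template.
ROUND_1 = PcrProgram(cycles=15, denature_s=40, anneal_s=40, extend_s=15)
ROUND_2 = PcrProgram(cycles=28, denature_s=40, anneal_s=40, extend_s=80)
```

Calling `ROUND_1.hold_seconds()` and `ROUND_2.hold_seconds()` gives a lower bound on run time for each round. The longer 80-second extension in round 2 reflects that the second product spans the template-derived cassette rather than just the primer overlap.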

[0071] Next was the replacement of a genomic region by insertion of the artificial DNA 
fragment. This was done by taking the target cell carrying pBADαβγ and preparing 
electrocompetent cells as described (Posfai, G. et al., Nucl. Acids Res. 27: 4409-4415 (1999)), 
except that 0.1% arabinose was added to the culture 0.25 - 1 hour before harvesting the cells. 4 µl 
of DNA fragments (100-200 ng) were electroporated into 40 µl of electrocompetent cells. The 
cells were plated on Cam plates (25 µg cam/ml) and incubated at 37°C. The usual result was to 
obtain a total of 10 to several hundred colonies after overnight incubation. A few colonies were 
checked for correct site insertion of the fragment by PCR using primers D and E. 

[0072] Next was the deletion of the inserted sequences. This was done by preparing 
competent cells derived from a selected colony from above by the CaCl2 method (Sambrook, J. et 
al., Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY (1989)). The plasmid pKSUC1 (~100 ng) was transformed into the cells by standard 
procedures (Sambrook, J. et al., Molecular Cloning: A Laboratory Manual. Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, NY (1989)). The cells were plated on Kan plates and 
incubated at 37°C (pKSUC1 and pBADαβγ are incompatible, thus selection on Kan eliminates 
pBADαβγ from the cells). The colonies were checked for correct deletion by PCR using primers 
D and E. A colony was selected carrying the correct deletion. At this point, the cells carried 
pKSUC1. The next step is to eliminate this plasmid. 
[0073] The plasmid is eliminated by replacing pKSUC1 with pBADαβγ. A 
colony from the prior step was selected and grown in LB at 37°C under nonselective conditions, 
reinoculating the cells into fresh medium 2-3 times. Competent cells were prepared for either 
chemical transformation or electroporation. The plasmid pBADαβγ (100-200 ng) was 
transformed into the competent cells, which were plated on Amp plates. A colony which was Kan 
sensitive/Amp resistant was selected by toothpicking a hundred colonies onto Kan and Amp plates. 

[0074] The selected colony can be used in a next round of deletion by using a new targeting 
fragment and repeating the steps above. If no more deletions are needed, growing the cells under 
nonselective conditions (no Amp is added) results in the spontaneous loss of pBADαβγ from a 
large fraction of the cells. 

Procedure 2 

[0075] This procedure is similar to Procedure 1, but pKSUC1 is replaced by pSTKST. This 
plasmid is compatible with pBADαβγ, has a temperature-sensitive replicon, and expression of 
I-SceI requires induction by chlortetracycline (CTC). The advantage is that elimination of pSTKST 
from the cell is easily accomplished by growing the culture at 42°C. 

[0076] Construction of a linear targeting fragment by PCR and replacement of a genomic 
region by insertion of the fragment are done as described in Procedure 1. 

[0077] To delete the inserted sequences, competent cells are prepared from a culture derived 
from a selected colony harboring the right insertion. Cells are transformed with pSTKST, plated on 
Kan+Cam plates and incubated at 30°C. A colony from this plate is inoculated into 10 ml of 
LB+Kan supplemented with heat-treated inducer cTc (25 µg/ml final concentration) and grown at 
30°C for 24 hours. This step induces the expression of I-SceI. Dilutions of the culture 
are then spread on LB+Kan plates and incubated overnight at 30°C. Six to twelve colonies were 
checked for correct deletion by PCR using primers D and E. A colony was selected carrying the 
correct deletion. 

[0078] To eliminate the helper plasmids from the cell, the culture is grown at 42°C in LB 
(no antibiotics added). 
Procedure 3 

[0079] Since pBADαβγ and pSTKST carry compatible replicons, repeated transformations 
of the plasmids are not required when consecutive deletions are made in the same host. The two 
plasmids are maintained in the host cell throughout consecutive deletion constructions by 
antibiotic selection (Kan+Amp). Recombinase and specific nuclease functions are induced only 
when needed. Since replication of pSTKST is temperature-sensitive, cells must be grown at 30°C. 

[0080] The procedure is identical to Procedure 2, except that pBADαβγ and pSTKST are 
transformed into the cell only once, and for as long as maintenance of both plasmids in the cell is 
desired, the culture is grown at 30°C, and Amp+Kan are included in the medium. Note: Sometimes we 
experienced difficulties in growing the cells at 30°C in the presence of two (Amp+Kan) or three 
(Amp+Kan+Cam) antibiotics. 

Procedure 4 

[0081] This is the preferred procedure when several consecutive deletions are to be made in 
the same cell. Insertions (recombination of linear fragments into the genome of a host cell carrying 
pBADαβγ) are made in parallel, creating a series of recombinant cells, each carrying a single 
insertion. These insertions are then transferred one by one by P1 transduction into the cell carrying 
pSTKST and harboring all previous deletions. Removal of all foreign sequences is done in this 
final host by inducing pSTKST. Compared to the previous methods, the main difference is that 
the insertion step and removal of the inserted sequences are done in separate cells. Since insertions 
are made in parallel, the construction of consecutive deletions is faster. Another advantage is that 
cells are transformed by the plasmids only at the beginning of the first deletion construction. 

[0082] Technically the procedure is identical to Procedure 2, except that individual 
insertions are transferred by P1 transduction to the deletion strain already harboring pSTKST. 
After each P1 transduction step, I-SceI expression is induced to remove the inserted sequences. 

Results 

[0083] Twelve consecutive genomic deletions have been made from E. coli strain K12 
MG1655. The twelve deleted regions were selected for deletion, in part, as a result of comparison 
of the genomic DNA sequences of E. coli strain O157:H7 EDL933 and strain K-12 MG1655. The 
deletions are listed in Table 1 below. The sequence numbering is taken from the published K12 
sequence. 

[0084] The first deletion MD1 was made using the method described in Posfai, G. et al., 
Nucl. Acids Res. 27: 4409-4415 (1999). Using this method for creating the MD1 deletion left a 
114-bp pSG76-CS vector sequence, including an FRT site, in the chromosome at the site of deletion. 
MD2 through MD6 deletions were made using Procedure 1 described above. Deletions MD7 
through MD12 were created using a combination of Procedure 4 and Procedure 1 or 2. Strain 
designations and genomic coordinates of each new deletion were: MD1 263080-324632; MD2 
1398351-1480278; MD3 2556711-2563500; MD4 2754180-2789270; MD5 2064327-2078613; 
MD6 3451565-3467490; MD7 2464565-2474198; MD8 1625542-1650785; MD9 4494243- 
4547279; MD10 3108697-3134392; MD11 1196360-1222299; MD12 564278-585331. 

[0085] A total of 378,180 base pairs, which is approximately 8.1% of the native K12 
MG1655 E. coli genome, was removed at this stage. Removing these regions from the genome 
did not affect bacterial survival or bacterial growth. 
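
The per-deletion sizes in Table 1 can be recomputed from the endpoints. The sketch below assumes the endpoints are inclusive 1-based coordinates, so that size = end - start + 1. It uses the endpoint and size values as listed in Table 1, and the variable and function names are hypothetical.

```python
# (start, end) endpoints and listed sizes in bp, as given in Table 1.
DELETIONS = {
    "MD1": (263080, 324632, 61553),
    "MD2": (1398351, 1480278, 81928),
    "MD3": (2556711, 2563500, 6790),
    "MD4": (2754180, 2789270, 35091),
    "MD5": (2064327, 2078613, 14287),
    "MD6": (3451565, 3467490, 15926),
    "MD7": (2464565, 2474198, 9634),
    "MD8": (1625542, 1650785, 25244),
    "MD9": (4494243, 4547279, 53037),
    "MD10": (3108697, 3134392, 25696),
    "MD11": (1196360, 1222299, 25940),
    "MD12": (564278, 585331, 21054),
}


def deletion_size(start: int, end: int) -> int:
    """Size in bp of a deletion whose endpoints are both counted (assumed convention)."""
    return end - start + 1


def mismatches() -> list:
    """Names of deletions whose computed size differs from the size listed in the table."""
    return [name for name, (start, end, listed) in DELETIONS.items()
            if deletion_size(start, end) != listed]
```

Under the inclusive-endpoint assumption, `mismatches()` returns an empty list, meaning every listed size agrees with its endpoints.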

[0086] Table 2 below lists other segments, genes and regions of the E. coli genome that 
were identified as candidates for further deletions. The segments were also successfully removed 
from the genome of the bacteria. Again, these deletions were made without any apparent 
deleterious effect on the usefulness of the bacteria for laboratory and industrial use. Again the 
sequence designations are taken from the published K12 sequence. The two sets of deletions 
totaled about 14% of the original bacterial genome. Further deletions are, of course, possible. 

[0087] In Procedure 1, efficiency of the insertion of the linear fragment varied with the 
particular genomic locus. Correct-site insertion occurred in 1-100% (normally 20-100%) of the 
colonies. Flanking homologies in the range of 42 to 74 bp were used. Longer homologies give 
better insertion efficiencies. Correct-site excision between the duplicated sequences occurred in 
1-100% (normally 10-100%) of the colonies and depended on the length of the duplicated region. 
Longer duplications are usually more effective. Length of the duplicated sequences was in the 
range of 42 to 50 bp. Variations in the efficiencies of insertion and excision existed between 
seemingly identically repeated experiments and are not fully understood yet. 

[0088] Procedure 3 was tested by re-creating deletion MD2. Correct-site insertion of the 
linear DNA fragment occurred in 6.6% of the colonies. Deletion of the inserted sequence was 
very efficient. Twenty-five resulting colonies were replica plated on Cam+Amp+Kan and 
Amp+Kan plates, and 19 of them proved to be Cam sensitive. Five of these colonies were then 
tested by PCR, and all 5 showed the predicted loss of the inserted sequence. 
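
The reported frequencies suggest how many colonies to screen. The following sketch assumes each colony is an independent trial with the same probability of being correct, which is a simplification given the variability noted above. The function names and the 95% default are illustrative choices, not part of the described procedures.

```python
import math


def prob_at_least_one(p_correct: float, n_colonies: int) -> float:
    """Probability that at least one of n screened colonies is correct."""
    return 1.0 - (1.0 - p_correct) ** n_colonies


def colonies_to_screen(p_correct: float, confidence: float = 0.95) -> int:
    """Smallest n for which at least one correct colony is found with the given confidence."""
    if not 0.0 < p_correct <= 1.0:
        raise ValueError("p_correct must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if p_correct == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_correct))
```

For example, `colonies_to_screen(0.066)` estimates the screen size for the 6.6% correct-site insertion frequency observed with Procedure 3. `colonies_to_screen(0.20)` corresponds to the lower end of the normal range reported for Procedure 1.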

[0089] In the above description, the present invention is described in connection with 
specific examples. It will be understood that the present invention is not limited to these 
examples, but rather is to be construed to be of the spirit and scope defined by the appended claims. 
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Description b 


b0246-b0310; includes K-islands #16, 17, 18, CP4-6. eaeH 


b1336-b1411; includes K-island #83, Rac 


b2441-b2450; includes K-island #128, CP-Eut 


b2622-b2660; includes K-island #1 37, CP4-57, ileY ! 


b1994-b2008; includes K-islands # 94, 95, 96, CP4-44 


b3323-b3338; includes K-islands #164, 165 


b2349-b2363; includes K-island #121 


b1539-b1579; includes K-island #77, Qin 


b4271-b4320; includes K-island #225, fee operon, fim ODeron 


b2968-b2987; includes K-island #153, glc operon 


b1137-b1172; includes K-island #71, e14 


b0538-b0565; includes K-island #37, DLP12 


Size (bp) 


61553 


81928 


6790 


35091 


14287 


15926 


9634 


25244 


53037 


25696 


25940 


21054 


as 

CO 
-*— * 

o 

Q_ 

XJ 

c 

LU 


263080, 324632 


1398351, 1480278 


2556711,2563500 


2754180, 2789270 


2064327, 2078613 


3451565, 3467490 


2464565,2474198 


1625542, 1650785 


4494243, 4547279 


3108697, 3134392 


1196360, 1222299 


564278, 585331 


Deletion 


MD1 


MD2 


MD3 


MD4 


MD5 


MD6 


MD7 


MD8 


MD9 


MD10 


MD11 


MD12 
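
The endpoint and size columns of the table above can be cross-checked against each other. The following is a minimal sketch, not part of the described procedures: it assumes that each size is the inclusive span between the two endpoints, and it assumes a genome length of 4,639,221 bp for the published K-12 MG1655 sequence (the constant `GENOME_BP` is not stated in this section). The names `DELETIONS` and `span_bp` are illustrative only.

```python
# Endpoints (start, end) and reported sizes copied from the MD1-MD12 table.
DELETIONS = {
    "MD1": (263080, 324632, 61553),
    "MD2": (1398351, 1480278, 81928),
    "MD3": (2556711, 2563500, 6790),
    "MD4": (2754180, 2789270, 35091),
    "MD5": (2064327, 2078613, 14287),
    "MD6": (3451565, 3467490, 15926),
    "MD7": (2464565, 2474198, 9634),
    "MD8": (1625542, 1650785, 25244),
    "MD9": (4494243, 4547279, 53037),
    "MD10": (3108697, 3134392, 25696),
    "MD11": (1196360, 1222299, 25940),
    "MD12": (564278, 585331, 21054),
}

# Assumption: length of the published K-12 MG1655 sequence (not given in this section).
GENOME_BP = 4_639_221


def span_bp(start: int, end: int) -> int:
    """Inclusive length in base pairs of the interval [start, end]."""
    return end - start + 1


# Deletions whose reported size disagrees with the span of their endpoints.
mismatches = [
    name for name, (start, end, size) in DELETIONS.items()
    if span_bp(start, end) != size
]

# Total deleted sequence and its share of the assumed genome length.
total_bp = sum(size for _, _, size in DELETIONS.values())
percent_of_genome = round(100 * total_bp / GENOME_BP, 1)

print("mismatches:", mismatches)
print("total deleted (bp):", total_bp)
print("percent of genome:", percent_of_genome)
```

Under these assumptions every listed size matches its endpoints, and the twelve deletions together remove 376,180 bp.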



TABLE 2 

SECOND SET OF COMPLETED DELETIONS 

IS186 deletions (3) 

keep dnaJ     14168,15298 (+)
*delete GP1   15388,20563           IS186, gef, nhaAR, IS1
[IS186        15388,16730]
[IS1          19796,20563]
keep rpsT     20815,21078 (-)

keep pheP     601182,602558 (+)
*delete GP2   602639,608573         ybdG, nfnB, ybdF, ybdJ, ybdK, IS186
[IS186        607231,608573]
keep entD     608682,609311 (-)

keep glk      2506481,2507446 (-)
*delete GP3   2507650,2515969       b2389, b2390, b2391, b2392, nupC, IS186, yfeA
[IS186        2512294,2513636]
keep alaX     2516061,2516136 (-)



IS2 deletions (3 not already deleted) 

keep yaiN     378830,379126 (-)
*delete GP4   379293,387870         yaiO, b0359, IS2, b0362, yaiP, yaiS, tauABCD
[IS2          380484,381814]
keep hemB     387977,388984 (-)
*delete GP5   389121,399029         b0370, yaiT, IS3, yaiU, yaiV, ampH, sbmA, yaiW, yaiY, yaiZ
[IS3          390933,392190]
keep ddlA     399053,400147 (-)

keep ygeK     2992482,2992928 (-)
*delete GP6   2992959,2996892       b2856, b2857, b2858, b2859, IS2, b2862, b2863
[IS2          2994383,2995713]
keep glyU     2997006,2997079 (-)

keep ribB     3181829,3182482 (-)
*delete GP7   3182796,3189712       b3042, ygiL, IS2, yqiGHI (fimbrial locus)
[IS2          3184112,3185442]
keep glgS     3189755,3189955 (-)



IS5 deletions (6 not already deleted) 

keep ybeJ     686062,686970 (-)
*delete GP8   687074,688268         IS5
keep lnt      688566,690104 (-)

keep tpx      1386329,1386835 (-)
*delete GP9   1386912,1396646       ycjG, ycjI, ycjY, ycgZ, mppA, ynaI, IS5, ynaJ, ydaA
[IS5          1394068,1395262]
keep fnr      1396798,1397550 (-)

keep gnd      2097884,2099290 (-)
*delete GP10  2099418,2135739       IS5 plus entire O Antigen and Colanic Acid clusters
[IS5          2099771,2100965]
keep yegH     2135858,2137507 (+)

keep proL     2284231,2284307 (+)
*delete GP11  2284410,2288200       yejO and IS5
[IS5          2286939,2288133]
keep narP     2288520,2289167 (+)

keep gltF     3358811,3359575 (+)
*delete GP12  3359747,3365277       IS5 plus yhcADEF (K-island)
[IS5          3363191,3364385]
keep yhcG     3365462,3366589 (+)

keep arsC     3647867,3648292 (+)
*delete GP13  3648921,3651343       yhiS and IS5
[IS5          3649666,3650860]
keep slp      3651558,3652157 (+)



flagella 

Region I 
keep mviN     1127062,1128597 (+)
*delete GP14  1128637,1140209       flgAMN flgBCDEFGHIJKL
keep rne      1140405,1143590 (-)

Region II 
keep yecT     1959975,1960484 (+)
*delete GP15  1960605,1977294       flh, che, mot, tap, tar, IS1
keep yecG     1977777,1978205 (+)

Regions IIIa and IIIb try deleting both in one action 
keep sdiA     1994133,1994855 (-)
*delete GP16  1995085,2021700       fli, plus amyA, yec and yed ORFs
keep rcsA     2021990,2022613 (+)



hsd region 

keep uxuR     4552145,4552918 (+)
*delete GP17  4553059,4594581       yji ORFs, plus mcrBCD, hsdRMS, mrr, tsr
keep mdoB     4594719,4596971 (-)



Rhs elements 

keep ybbP     519640,522054 (+)
*delete GP18  522062,529348         RhsD element & associated ORFs
keep ybbB     529356,530450 (-)

keep ybfA     728357,728563 (+)
*delete GP19  728616,738185         RhsC element & associated ORFs
keep ybgA     738224,738733 (+)

keep yncH     1524964,1525176 (+)
*delete GP20  1525914,1531648       RhsE element & associated ORFs
keep nhoA     1532048,1532893 (+)

keep nikR     3616219,361662 (+)
*delete GP21  3616623,3623309       RhsB element & associated ORFs #
              may need to leave something here to separate converging ORFs?
keep yhhJ     3623310,3624437 (-)

keep yibF     3758974,3759582 (-)
*delete GP22  3759620,3767868       RhsA element & associated ORFs
keep yibH     3767870,3769006 (-)



the rest of the IS elements 

keep appA     1039840,1041138 (+)
*delete GP23  1041253,1049768       yccZYC (EPS), ymcDCBA (EPS?), IS1
[IS1          1049001,1049768]
keep cspH     1050186,1050398 (-)

keep phoH     1084215,1085279 (+)
*delete GP24  1085329,1096603       ycdSRQPT (hms homologues), IS3, ymdE, ycdU
[IS3          1093468,1094725]
keep serX     1096788,1096875 (-)

keep baeR     2162298,216302 (+)
*delete GP25  2163172,2175230       P2 remnant, IS3, gat operon
[IS3          2168193,2169450]
keep fbaB     2175532,2176656 (-)

keep yhhX     3577399,3578436 (-)
*delete GP26  3578769,3582674       yhhYZ, IS1
[IS1          3581059,3581826]
keep ggt      3582712,3584454 (-)

keep cspA     3717678,3717890 (+)
*delete GP27  3718262,3719704       IS150
[IS150        3718262,3719704]
keep glyS     3719957,3722026 (-)
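
Each group in Table 2 follows the same pattern: a flanking gene to keep, the segment to delete, any IS element bracketed within that segment, and the flanking gene to keep on the other side. The following is a minimal sketch of how such a group could be checked for internal consistency; it is an illustration only and not part of the described procedures. The class names `Interval` and `PlannedDeletion` and the method `problems` are hypothetical, and only four groups (GP1, GP4, GP8 and GP27) are copied from the table as sample data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interval:
    name: str
    start: int  # first base of the feature
    end: int    # last base of the feature, inclusive

    def length(self) -> int:
        return self.end - self.start + 1

    def contains(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end

    def overlaps(self, other: "Interval") -> bool:
        return self.start <= other.end and other.start <= self.end


@dataclass
class PlannedDeletion:
    left_keep: Interval
    deleted: Interval
    right_keep: Interval
    is_elements: list = field(default_factory=list)

    def problems(self) -> list:
        """Return human-readable inconsistencies; an empty list means none found."""
        found = []
        # A kept flanking gene must not be touched by the deleted segment.
        for gene in (self.left_keep, self.right_keep):
            if self.deleted.overlaps(gene):
                found.append(f"{self.deleted.name} overlaps kept gene {gene.name}")
        # Every bracketed IS element must lie inside the deleted segment.
        for element in self.is_elements:
            if not self.deleted.contains(element):
                found.append(f"{element.name} is not inside {self.deleted.name}")
        return found


# Sample groups copied from Table 2.
PLAN = [
    PlannedDeletion(Interval("dnaJ", 14168, 15298), Interval("GP1", 15388, 20563),
                    Interval("rpsT", 20815, 21078),
                    [Interval("IS186", 15388, 16730), Interval("IS1", 19796, 20563)]),
    PlannedDeletion(Interval("yaiN", 378830, 379126), Interval("GP4", 379293, 387870),
                    Interval("hemB", 387977, 388984),
                    [Interval("IS2", 380484, 381814)]),
    PlannedDeletion(Interval("ybeJ", 686062, 686970), Interval("GP8", 687074, 688268),
                    Interval("lnt", 688566, 690104)),
    PlannedDeletion(Interval("cspA", 3717678, 3717890), Interval("GP27", 3718262, 3719704),
                    Interval("glyS", 3719957, 3722026),
                    [Interval("IS150", 3718262, 3719704)]),
]

sizes = {d.deleted.name: d.deleted.length() for d in PLAN}
all_problems = [p for d in PLAN for p in d.problems()]

print(sizes)
print(all_problems)
```

The same check can be applied to the remaining groups by adding their coordinates to `PLAN`; entries whose coordinates are incomplete in the table would need to be resolved against the published sequence first.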